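Modify the FunASR real-time recognition framework so that 2pass mode returns sentence-level timestamps at the framework level during real-time recognition, in milliseconds #2216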
|
{"is_final":false,"mode":"2pass-offline","stamp_sents":[{"end":2270,"punc":",","start":430,"text_seg":"正 是 因 为 存 在 绝 对 正 义","ts_list":[[430,670],[670,810],[810,1030],[1030,1130],[1130,1330],[1330,1510],[1510,1670],[1670,1810],[1810,1970],[1970,2270]]},{"end":4535,"punc":"","start":2270,"text_seg":"所 以 我 们 接 受 现 实 的 相 对 正 义","ts_list":[[2270,2389],[2389,2490],[2490,2570],[2570,2709],[2709,2969],[2969,3310],[3310,3570],[3570,3730],[3730,3830],[3830,3969],[3969,4150],[4150,4270],[4270,4535]]}],"text":"正是因为存在绝对正义,所以我们接受现实的相对正义","timestamp":"[[430,670],[670,810],[810,1030],[1030,1130],[1130,1330],[1330,1510],[1510,1670],[1670,1810],[1810,1970],[1970,2270],[2270,2389],[2389,2490],[2490,2570],[2570,2709],[2709,2969],[2969,3310],[3310,3570],[3570,3730],[3730,3830],[3830,3969],[3969,4150],[4150,4270],[4270,4535]]","wav_name":"wav_default_id"} As shown above, the 2pass-offline result in 2pass mode already supports sentence-level timestamps. Could you explain your PR in detail and provide some example results? |
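For readers following along, here is a minimal Python sketch of how a client might read the sentence-level timestamps out of the message above. The field names (`stamp_sents`, `text_seg`, `punc`, `start`, `end`, `ts_list`, `timestamp`) are taken from this one example only; the helper names `sentence_spans` and `char_timestamps` are hypothetical, and other messages may have a different shape. Note that in this example `timestamp` is a JSON-encoded string rather than a list, so it needs a second decode.

```python
import json

# The 2pass-offline message quoted above, copied verbatim.
SAMPLE = r'''{"is_final":false,"mode":"2pass-offline","stamp_sents":[{"end":2270,"punc":",","start":430,"text_seg":"正 是 因 为 存 在 绝 对 正 义","ts_list":[[430,670],[670,810],[810,1030],[1030,1130],[1130,1330],[1330,1510],[1510,1670],[1670,1810],[1810,1970],[1970,2270]]},{"end":4535,"punc":"","start":2270,"text_seg":"所 以 我 们 接 受 现 实 的 相 对 正 义","ts_list":[[2270,2389],[2389,2490],[2490,2570],[2570,2709],[2709,2969],[2969,3310],[3310,3570],[3570,3730],[3730,3830],[3830,3969],[3969,4150],[4150,4270],[4270,4535]]}],"text":"正是因为存在绝对正义,所以我们接受现实的相对正义","timestamp":"[[430,670],[670,810],[810,1030],[1030,1130],[1130,1330],[1330,1510],[1510,1670],[1670,1810],[1810,1970],[1970,2270],[2270,2389],[2389,2490],[2490,2570],[2570,2709],[2709,2969],[2969,3310],[3310,3570],[3570,3730],[3730,3830],[3830,3969],[3969,4150],[4150,4270],[4270,4535]]","wav_name":"wav_default_id"}'''


def sentence_spans(message):
    """Return (sentence_text, start_ms, end_ms) for each entry in stamp_sents.

    Hypothetical helper: assumes the field layout seen in the sample message.
    """
    msg = json.loads(message)
    spans = []
    for sent in msg.get("stamp_sents", []):
        # text_seg is space-separated per character in the sample; joining
        # without spaces suits Chinese text but would merge English words.
        text = sent["text_seg"].replace(" ", "") + sent.get("punc", "")
        spans.append((text, sent["start"], sent["end"]))
    return spans


def char_timestamps(message):
    """Return the flat per-character [start_ms, end_ms] list.

    In the sample, "timestamp" is a JSON string, so it is decoded a second time.
    """
    msg = json.loads(message)
    ts = msg.get("timestamp", "[]")
    return json.loads(ts) if isinstance(ts, str) else ts


if __name__ == "__main__":
    for text, start, end in sentence_spans(SAMPLE):
        print(f"{start:>6} ms - {end:>6} ms  {text}")
```

In this sample, concatenating the per-sentence `ts_list` entries gives the same pairs as the flat `timestamp` field, and each sentence's `start` and `end` match the first and last pair of its `ts_list`.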
|
Understood, your PR uses the VAD timestamps. |
|
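May I ask whether there are plans for a SenseVoice hotword model? We trained SenseVoiceSmall ourselves following the documentation and found the results were poor, so we hope an official hotword version can be supported. |
There are currently no plans for a hotword version. |
Are the SenseVoice timestamp model and the related support code available yet? |
The latest FunASR release already supports framework-level character timestamps for SenseVoice, but I have run into some problems in actual use, and it is not mature yet. |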
|
|
This is a substantial change (18 files) to the runtime 2pass mode for sentence-level timestamps. The feature is valuable, but I cannot verify the C++ runtime build and behavior at this time. The PR also likely has conflicts with recent main branch changes. Please rebase against the latest main and ensure the build passes. If you can provide test results showing the timestamp output, we can reconsider merging. |
|
Thanks for this PR. The feature (sentence-level timestamps in 2pass streaming mode) is valuable. However, this touches 18 files in the C++ runtime and needs thorough testing with the streaming service before merge. Could you confirm:
We want to make sure this integrates cleanly with the current runtime. |


